THE EUCLID-MULLIN GRAPH


ANDREW R. BOOKER AND SEAN A. IRVINE 

Abstract. We introduce the Euclid-Mullin graph, which encodes all instances of Euclid’s 
proof of the infinitude of primes. We investigate structural properties of the graph both 
theoretically and numerically; in particular, we prove that it is not a tree. 


1. Introduction 

The Euclid-Mullin sequence begins [1,] 2, 3, 7, 43, 13, 53, 5, 6221671, where each term
is the least prime factor of 1 plus the product of all the preceding terms. As such it can
be viewed as a computational form of Euclid’s proof that the number of primes is infinite.
A companion sequence, sometimes referred to as the second Euclid-Mullin sequence, takes
the largest prime factor at each step. These sequences are A000945 and A000946 in the
OEIS [H]. Both sequences were introduced by Mullin [TT], who asked whether every prime
occurs in these sequences. Mullin’s question has been answered negatively for the second
sequence, and in fact the second sequence omits infinitely many primes din]. The question
for the first sequence remains open.

Here a generalization is considered, where rather than choosing the least or largest prime
factor at each stage, all prime factors are considered. Since there are now, in general, multiple
choices for the next element, the result is not a single sequence, but a (directed) graph where
each path from the root to a node corresponds to a particular sequence of primes. Questions
asked about Mullin’s sequence can now also be asked about the graph. In particular, does
the graph contain every prime? If it were ever shown that Mullin’s original sequence contains
every prime, then the graph would also include every prime.

The graph admits other structural questions. While the graph is obviously infinite it 
would be interesting to know how the number of nodes grows at each level (or, indeed, to 
determine if it does grow!). As a first step in this direction, this paper establishes that the 
graph is not a tree. 

The directed graph $G_n \subset (\mathbb{Z}, \mathbb{Z}\times\mathbb{Z})$ consists of a set of integer-labelled nodes and edges
defined by ordered pairs of nodes. $G_n$ can be defined recursively by: $n$ is a node in $G_n$. If
$m$ is a node in $G_n$ then so are all of $mp_i$, where $m + 1 = \prod_{i=1}^{k} p_i^{e_i}$ ($e_i > 0$) is the unique
factorization of $m + 1$. Further, $G_n$ has directed edges $(m, mp_i)$. It is sometimes convenient
to think of the edge $(m, mp_i)$ as being labelled $p_i$. We say $n$ is the root of the graph and has
level 0. Any node adjacent to $n$ is said to be level 1. In general, any node reachable by a
directed path of $r$ edges is said to be level $r$. In fact, a path of length $r$ represents a product
of $r$ distinct primes. We call $G_1$ the Euclid-Mullin graph; its first few levels are shown in
Figure 1.

Figure 1. $G_1$.
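The recursive definition can be explored directly for small levels. The following is a minimal sketch, not the authors' code: the helper names `prime_factors` and `levels` are hypothetical, and trial division is an assumption that is only practical for the first few levels, since the nodes grow very quickly.

```python
def prime_factors(m):
    """Return the set of distinct prime factors of m >= 2 (trial division)."""
    factors, d = set(), 2
    while d * d <= m:
        while m % d == 0:
            factors.add(d)
            m //= d
        d += 1
    if m > 1:
        factors.add(m)
    return factors


def levels(n, depth):
    """Return [level 0, ..., level depth] of G_n, each as a sorted list of nodes."""
    result = [[n]]
    for _ in range(depth):
        # Every node m has one child m*p for each prime p dividing m + 1.
        children = {m * p for m in result[-1] for p in prime_factors(m + 1)}
        result.append(sorted(children))
    return result


if __name__ == "__main__":
    for r, nodes in enumerate(levels(1, 6)):
        print(r, nodes)
```

With root 1 the graph has a single node per level up to level 4 (the node 1806), and the first branching occurs at level 5 because 1807 = 13 · 139.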

Theorem 1.1. The Euclid-Mullin graph $G_1$ is not a tree. In particular, each of the following
nodes is connected to 1 by two distinct paths:

2 · 3 · 7 · 43 · 139 · 50207 · 1607 · 38891 · 71609249149971437 · 104851
· 5914302068415095755097398828253214149923
· 103 · 1750880132687750604376675981842334069
· 103451 · 193 · 22133 · 5587528960270206397663051
· 73 · 5 · 13 · 593

and

2 · 3 · 7 · 43 · 139 · 50207 · 23 · 217733 · 4024572619121
· 539402497343 · 72208156847017648587223 · 79
· 7269452239696911635939429787229069136737446558564286318153183
· 8689 · 107 · 2895777621755988962510175673615781760909999040975810951
· 531543631 · 73 · 5 · 13 · 593.

In each case, the order of the prime factors indicates one path, and the other path is obtained
by swapping 73 and 593.

Note that the numbers given in the theorem both have level 21. Based on some probabilistic
considerations presented later in the paper, we suspect that any node of lower level is connected
to 1 by a unique path, but answering this definitively is likely to remain infeasible for the
foreseeable future.
2. Multiple k-tuples of edges

Given a positive integer $n$, a path in $G_n$ between $n$ and $m = p_1 \cdots p_k n$ can be identified
with the $k$-tuple of edge primes $(p_1, \ldots, p_k)$. In this section, we formalize this notion and
formulate conditions under which nodes may be connected by more than one path. We also
establish several theoretical results, including the following:

• For $k = 3$, we obtain a complete classification of the triples $(p_1, p_2, p_3)$ that form
one side of a loop in some $G_n$, given as the prime values of certain polynomials; see
Theorem 2.5.

• We prove that there is a $k \le 13$ such that, for any $q \in \mathbb{Z}_{>0}$, there are infinitely many
$k$-tuples $(p_1, \ldots, p_k)$ that form one side of a loop in some $G_n$ and satisfy $(p_1 \cdots p_k, q) = 1$.
Moreover, any given prime occurs as an edge of a loop of height at most 13 in some
$G_n$; see Theorem 2.15.


First, let $\mathcal{P}_k$ denote the set of $k$-tuples $(p_1, \ldots, p_k)$, where each $p_i$ is a prime number and
$p_i \ne p_j$ for $i \ne j$. The symmetric group $S_k$ acts on $\mathcal{P}_k$ by permuting the indices; precisely,
for $\pi \in S_k$ we write $\pi.(p_1, \ldots, p_k) = (q_1, \ldots, q_k)$, where $p_i = q_{\pi(i)}$ for $i = 1, \ldots, k$.

Definition 2.1. Let $P = (p_1, \ldots, p_k)$, $Q = (q_1, \ldots, q_k) \in \mathcal{P}_k$.

(1) We say that $P$ and $Q$ are equivalent, and write $P \sim Q$, if there exists $\pi \in S_k$ such
that $Q = \pi.P$ and
\[ p_1 \cdots p_{i-1} \equiv q_1 \cdots q_{\pi(i)-1} \pmod{p_i} \quad\text{for } i = 1, \ldots, k. \]

(2) The multiplicity of $P$, denoted $m(P)$, is the number of $\pi \in S_k$ such that $P \sim \pi.P$.

(3) We say that $P$ is multiple if $m(P) > 1$.

(4) We call $p_1 \cdots p_k$ the modulus of $P$, and denote it by $|P|$.

It is straightforward to verify that $\sim$ defines an equivalence relation on $\mathcal{P}_k$. Its relevance
to the graphs $G_n$ is described by the following key lemma.

Lemma 2.2. For $P = (p_1, \ldots, p_k) \in \mathcal{P}_k$, let $N(P)$ denote the set of positive integers $n$ such
that $n$ and $|P|n$ are connected in $G_n$ via edges $p_1, \ldots, p_k$, i.e.
\[ p_1 \mid n + 1, \quad p_2 \mid p_1 n + 1, \quad \ldots, \quad p_k \mid p_1 \cdots p_{k-1} n + 1. \]
Then:

(1) $N(P)$ is an arithmetic progression modulo $|P|$, i.e.
\[ N(P) = \{ n \in \mathbb{Z}_{>0} : n \equiv a \pmod{|P|} \} \]
for some $a = a(P) \in \mathbb{Z}$ relatively prime to $|P|$.

(2) $Q \in \mathcal{P}_k$ is equivalent to $P$ if and only if $N(Q) = N(P)$.

(3) For any $n \in N(P)$, the paths in $G_n$ between $n$ and $|P|n$ are in one-to-one correspondence
with the equivalence class of $P$. In particular, the number of such paths is the
multiplicity $m(P)$.

Proof.

(1) The conditions on $n$ can be rephrased as the system of congruences
\[ n \equiv -1 \pmod{p_1}, \quad n \equiv -p_1^{-1} \pmod{p_2}, \quad \ldots, \quad n \equiv -(p_1 \cdots p_{k-1})^{-1} \pmod{p_k}, \]
and the solutions form an arithmetic progression, by the Chinese remainder theorem.
Since none of the numbers on the right-hand side can be congruent to 0, the elements
of $N(P)$ lie in an invertible residue class modulo $|P|$.

(2) Suppose that $P = (p_1, \ldots, p_k)$ and $Q = (q_1, \ldots, q_k)$ are equivalent. Then there is a
permutation $\pi \in S_k$ such that $Q = \pi.P$. Choose $n \in N(P)$, $j \in \{1, \ldots, k\}$, and set
$i = \pi^{-1}(j)$, so that $p_i = q_j$. Then,
\[ 0 \equiv p_1 \cdots p_{i-1} n + 1 \equiv q_1 \cdots q_{j-1} n + 1 \pmod{p_i = q_j}. \tag{2.1} \]
Since this holds for every $j$, $n$ is contained in $N(Q)$. Since $n$ was an arbitrary element
of $N(P)$, this shows that $N(P) \subseteq N(Q)$. Applying the argument again with the roles
of $P$ and $Q$ reversed, we also get $N(Q) \subseteq N(P)$, and hence $N(P) = N(Q)$.

Conversely, suppose that $N(P) = N(Q)$. By part (1), we must have $|P| = |Q|$,
and hence there is a permutation $\pi \in S_k$ such that $Q = \pi.P$. Let $n \in N(P) = N(Q)$,
$i \in \{1, \ldots, k\}$, and set $j = \pi(i)$, so that $p_i = q_j$. Then again we obtain (2.1), and
since $n$ is invertible modulo $|P| = |Q|$, it follows that
\[ p_1 \cdots p_{i-1} \equiv q_1 \cdots q_{j-1} \pmod{p_i = q_j}. \]
Since this holds for all $i$, $P$ and $Q$ are equivalent.

(3) Let $P = (p_1, \ldots, p_k)$, $n \in N(P)$, and $m = |P|n$. Suppose that there is a path
in $G_n$ between $n$ and $m$ via edges $q_1, \ldots, q_l$. Then we have $m = q_1 \cdots q_l n$, so that
$p_1 \cdots p_k = q_1 \cdots q_l$. By unique factorization, we have $l = k$ and $Q = (q_1, \ldots, q_k) \in \mathcal{P}_k$.
By part (1), $N(P)$ and $N(Q)$ are arithmetic progressions with the same modulus.
Since they also have a common element $n \in N(P) \cap N(Q)$, they must be equal. By
part (2), $P$ and $Q$ are therefore equivalent. Conversely, if $P$ and $Q$ are equivalent
then $N(P) = N(Q)$, so there is a path in $G_n$ between $n$ and $|Q|n = m$.

□
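Lemma 2.2 translates into a short computation: $a(P)$ is obtained by solving the congruences one prime at a time, and the equivalence class of $P$ consists of the permutations of $P$ with the same residue. The sketch below is illustrative only; the function names `residue`, `equivalence_class` and `is_path` are hypothetical, and the three-argument `pow` with exponent -1 assumes Python 3.8 or later.

```python
from itertools import permutations


def residue(P):
    """Return a(P) in [0, |P|): the residue class of N(P) modulo |P|."""
    a, modulus = 0, 1  # invariant: n = a (mod modulus) satisfies all earlier conditions
    for p in P:
        # modulus is the product of the earlier primes; we need modulus*n + 1 = 0 (mod p)
        # for n = a + modulus*t, i.e. modulus^2 * t = -(modulus*a + 1) (mod p).
        t = (-(modulus * a + 1) * pow(modulus * modulus, -1, p)) % p
        a += modulus * t
        modulus *= p
    return a


def equivalence_class(P):
    """All orderings Q of P with N(Q) = N(P), using Lemma 2.2(2)."""
    target = residue(P)
    return sorted(Q for Q in permutations(P) if residue(Q) == target)


def is_path(n, P):
    """Check directly that the edges P lead from n to |P|*n in G_n."""
    for p in P:
        if (n + 1) % p != 0:
            return False
        n *= p
    return True


if __name__ == "__main__":
    print(residue((2, 3, 5)), equivalence_class((2, 3, 5)))
    print(is_path(19, (2, 3, 5)), is_path(19, (5, 3, 2)))
```

For example, the two paths 19 → 38 → 114 → 570 and 19 → 95 → 285 → 570 form a loop in $G_{19}$, in agreement with $a((2,3,5)) = 19$.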


Lemma 2.3. There are no multiple $k$-tuples for $k < 3$.

Proof. This is obvious for $k = 1$. For $k = 2$, the only non-trivial possibility is that $(p_1, p_2)$ is
equivalent to $(q_1, q_2) = (p_2, p_1)$. Then by Definition 2.1 we have
\[ 1 \equiv q_1 = p_2 \pmod{p_1}, \qquad p_1 \equiv 1 \pmod{p_2}, \]
so that $p_1 < p_2 < p_1$, which is impossible. □
2.1. Multiple triples.

Proposition 2.4. Let $P = (p_1, p_2, p_3) \in \mathcal{P}_3$. Then $m(P) > 1$ if and only if
\[ p_2(p_1 + p_3) \equiv 1 \pmod{p_1 p_3} \quad\text{and}\quad p_1 \equiv p_3 \pmod{p_2}. \tag{2.2} \]
In this case, $m(P) = 2$ and the equivalence class of $P$ is $\{(p_1, p_2, p_3), (p_3, p_2, p_1)\}$.


Proof. Suppose that $P = (p_1, p_2, p_3)$ is equivalent to $Q = (q_1, q_2, q_3) = \pi.P$ for some non-trivial
$\pi \in S_3$. Since there are no multiple pairs, we must have $p_1 \ne q_1$ and $p_3 \ne q_3$, so
$\pi \in \{(13), (123), (132)\}$.

First suppose that $\pi$ is a 3-cycle. By reversing the roles of $P$ and $Q$ if necessary, we may
assume that $\pi = (123)$. Then $(p_1, p_2, p_3) = (q_2, q_3, q_1)$, so by Definition 2.1 we have
\[
\begin{aligned}
1 &\equiv q_1 = p_3 \pmod{p_1} \\
p_1 &\equiv q_1 q_2 = p_1 p_3 \pmod{p_2} \implies 1 \equiv p_3 \pmod{p_2} \\
p_1 p_2 &\equiv 1 \pmod{p_3}.
\end{aligned}
\]
Thus, $p_3 \equiv 1 \pmod{p_1 p_2}$ and $p_1 p_2 \equiv 1 \pmod{p_3}$, which is impossible.

The only remaining choice is $\pi = (13)$. Then $(p_1, p_2, p_3) = (q_3, q_2, q_1)$, and we have
\[
\begin{aligned}
1 &\equiv q_1 q_2 = p_2 p_3 \pmod{p_1} \\
p_1 &\equiv q_1 = p_3 \pmod{p_2} \\
p_1 p_2 &\equiv 1 \pmod{p_3},
\end{aligned}
\]
which is equivalent to the system (2.2). Conversely, the steps above are clearly reversible,
so that any $(p_1, p_2, p_3)$ satisfying (2.2) is equivalent to $(p_3, p_2, p_1)$.

Finally, since (13) is the only non-trivial permutation that can relate equivalent triples,
any multiple $P \in \mathcal{P}_3$ must have $m(P) = 2$ and equivalence class $\{P, (13).P\}$. □


Table 2.1 shows the first few solutions to (2.2) with $p_1 < p_3$, ordered by modulus.

P                          |P|                  a(P)
(2, 3, 5)                  30                   19
(3, 2, 5)                  30                   29
(7, 5, 17)                 595                  237
(211, 197, 2969)           123412423            114015537
(601, 577, 14449)          5010580873           4793484647
(8191, 8101, 737281)       48922495303771       48372940054709
(22921, 21169, 276949)     134379711825901      123251758931063

Table 2.1. Multiple triples $P = (p_1, p_2, p_3)$ with $p_1 < p_3$
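The criterion (2.2) of Proposition 2.4 is easy to test by brute force on small primes. The following sketch is only an illustration under that assumption (exhaustive search over ordered triples of distinct primes below a bound); the names `primes_below`, `satisfies_2_2` and `multiple_triples` are hypothetical, and this is not how the larger table entries would be found in practice.

```python
from itertools import permutations


def primes_below(limit):
    """Primes p < limit, by naive trial division (fine for small limits)."""
    return [p for p in range(2, limit)
            if all(p % d for d in range(2, int(p ** 0.5) + 1))]


def satisfies_2_2(p1, p2, p3):
    """Condition (2.2): p2(p1+p3) = 1 (mod p1*p3) and p1 = p3 (mod p2)."""
    return (p2 * (p1 + p3) - 1) % (p1 * p3) == 0 and (p1 - p3) % p2 == 0


def multiple_triples(limit):
    """Multiple triples of distinct primes below limit with p1 < p3, sorted by modulus."""
    found = [P for P in permutations(primes_below(limit), 3)
             if P[0] < P[2] and satisfies_2_2(*P)]
    return sorted(found, key=lambda P: (P[0] * P[1] * P[2], P))


if __name__ == "__main__":
    print(multiple_triples(20))
```

Restricted to primes below 20, the search returns exactly the first three rows of the table.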


2.1.1. Integer triples. Let us temporarily drop the restriction that $p_1$, $p_2$ and $p_3$ be prime,
and consider all solutions to (2.2) in integers. Then it turns out that we can give a complete
classification. In order to state it, we recall that the Fibonacci polynomials $F_n(x)$ are defined
by the recurrence
\[ F_0(x) = 0, \quad F_1(x) = 1, \quad\text{and}\quad F_n(x) = xF_{n-1}(x) + F_{n-2}(x) \text{ for } n \ge 2, \]
generalizing the usual Fibonacci numbers $F_n = F_n(1)$. By convention we extend the definition
to negative indices by defining $F_{-n}(x) = F_n(-x) = (-1)^{n-1}F_n(x)$.


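Theorem 2.5. Let $(p_1, p_2, p_3) \in \mathbb{Z}^3$. Then $(p_1, p_2, p_3)$ satisfies (2.2) if and only if one of the
following holds for some $n, x \in \mathbb{Z}$ and $\delta \in \{\pm 1\}$:
\[
(p_1, p_2, p_3) =
\begin{cases}
\delta\bigl(F_{n-1}(x) + F_n(x),\ F_{-n}(x),\ F_n(x) + F_{n+1}(x)\bigr) \\
\delta\bigl(F_n(x),\ F_{-n}(x) + F_{-(n+1)}(x),\ F_{n+1}(x)\bigr) \\
\delta(1, x, 1) \\
\delta(x, 1, 1 - x).
\end{cases}
\tag{2.3}
\]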
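The "if" direction of the theorem can be spot-checked numerically. The sketch below evaluates the four families of (2.3) for small $n$ and $x$ and tests (2.2) over the integers; the function names are hypothetical, and the convention that a congruence modulo 0 means equality is an assumption made here so that degenerate cases can be tested.

```python
def fib_poly(n, x):
    """Value of the Fibonacci polynomial F_n at x, with F_{-n}(x) = F_n(-x)."""
    if n < 0:
        return fib_poly(-n, -x)
    a, b = 0, 1  # F_0, F_1
    for _ in range(n):
        a, b = b, x * b + a
    return a


def divides(m, v):
    """m | v over the integers, reading a congruence modulo 0 as equality."""
    return v == 0 if m == 0 else v % m == 0


def satisfies_2_2(p1, p2, p3):
    return divides(p1 * p3, p2 * (p1 + p3) - 1) and divides(p2, p3 - p1)


def families(n, x):
    """The four lines of (2.3) with delta = 1."""
    def F(k):
        return fib_poly(k, x)
    return [
        (F(n - 1) + F(n), F(-n), F(n) + F(n + 1)),
        (F(n), F(-n) + F(-(n + 1)), F(n + 1)),
        (1, x, 1),
        (x, 1, 1 - x),
    ]


if __name__ == "__main__":
    ok = all(satisfies_2_2(*(d * c for c in t))
             for n in range(-6, 7) for x in range(-6, 7)
             for t in families(n, x) for d in (1, -1))
    print(ok, families(3, 2)[0], families(3, 14)[0])
```

The first family with $n = 3$ and $x = 2$ or $x = 14$ gives the triples (7, 5, 17) and (211, 197, 2969) from Table 2.1.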


Proof. The Fibonacci polynomials are given by the following explicit formula:
\[ F_n(x) = \frac{1}{\sqrt{x^2+4}}\left[\left(\frac{x+\sqrt{x^2+4}}{2}\right)^{n} - \left(\frac{x-\sqrt{x^2+4}}{2}\right)^{n}\right]. \tag{2.4} \]
Using this one can verify that
\[ F_{n+1}(x)F_{n-1}(x) = F_n(x)^2 + (-1)^n, \]
and combined with the recurrence identity $F_{n+1}(x) - F_{n-1}(x) = xF_n(x)$ we see that if
\[ (p_1, p_2, p_3) = \delta\bigl(F_{n-1}(x) + F_n(x),\ F_{-n}(x),\ F_n(x) + F_{n+1}(x)\bigr) \]
then
\[ p_2(p_1 + p_3) = 1 + (-1)^{n-1}p_1p_3 \quad\text{and}\quad p_3 - p_1 = (-1)^{n-1}xp_2. \]
Similarly, we obtain the identity
\[ F_{n+1}(x)^2 - F_n(x)^2 = xF_n(x)F_{n+1}(x) + (-1)^n, \]
from which it follows that if
\[ (p_1, p_2, p_3) = \delta\bigl(F_n(x),\ F_{-n}(x) + F_{-(n+1)}(x),\ F_{n+1}(x)\bigr) \]
then
\[ p_2(p_1 + p_3) = 1 + (-1)^n x p_1 p_3 \quad\text{and}\quad p_3 - p_1 = (-1)^n p_2. \]
Thus, in either case, $(p_1, p_2, p_3)$ is a solution to (2.2). The final two solutions are straightforward
to verify directly.

Now suppose that $(p_1, p_2, p_3) \in \mathbb{Z}^3$ satisfies (2.2), and write
\[ p_3 - p_1 = qp_2, \qquad p_2(p_1 + p_3) = 1 + rp_1p_3 \tag{2.5} \]
for some $q, r \in \mathbb{Z}$. If $p_1p_2p_3qr = 0$ then it is easy to see that either $p_1p_3 = 1$ or $p_2(p_1 + p_3) = 1$,
and all such solutions are described by the third and fourth lines of (2.3). Otherwise $q$ and
$r$ are uniquely determined and non-zero.

Next, set
\[ s = r(p_1 + p_3) - 2p_2 \quad\text{and}\quad d = (qr)^2 + 4. \tag{2.6} \]
Then $d$ is not a square, and a computation shows that $s$ and $p_2$ are related by the Pell-type
equation
\[ s^2 - dp_2^2 = -4r. \tag{2.7} \]
In other words, $\frac{s + p_2\sqrt{d}}{2}$ is an element of norm $-r$ in the quadratic order $\mathcal{O} = \mathbb{Z}\bigl[\frac{d+\sqrt{d}}{2}\bigr]$. (Note
that $\mathcal{O}$ need not be the maximal order in $\mathbb{Q}(\sqrt{d})$.)
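If $r = \pm 1$ then (2.7) is just the unit equation for $\mathcal{O}$. It is easy to see that $\frac{q+\sqrt{d}}{2}$ is a
fundamental unit (of norm $-1$), so the general solution of (2.7) in this case is given by
\[ \frac{s + p_2\sqrt{d}}{2} = \delta\left(\frac{q+\sqrt{d}}{2}\right)^{n} \]
for $\delta \in \{\pm 1\}$ and $n \in \mathbb{Z}$ with $(-1)^{n-1} = r$. Thus,
\[ p_2 = \frac{\delta}{\sqrt{d}}\left[\left(\frac{q+\sqrt{d}}{2}\right)^{n} - \left(\frac{q-\sqrt{d}}{2}\right)^{n}\right] = \delta F_n(q) \quad\text{and}\quad s = \delta\left[\left(\frac{q+\sqrt{d}}{2}\right)^{n} + \left(\frac{q-\sqrt{d}}{2}\right)^{n}\right] = \delta L_n(q), \]
where $L_n(x) = F_{n+1}(x) + F_{n-1}(x)$ is the Lucas polynomial. Recalling the definition of $s$, we
have
\[ p_1 + p_3 = \delta'\bigl(L_n(q) + 2F_n(q)\bigr), \]
where $\delta' = (-1)^{n-1}\delta$. Together with $p_3 - p_1 = qp_2 = (-1)^{n-1}\delta' qF_n(q)$, this yields
\[ p_1 = \delta'\,\frac{L_n(q) + 2F_n(q) - (-1)^{n-1}qF_n(q)}{2}, \qquad p_3 = \delta'\,\frac{L_n(q) + 2F_n(q) + (-1)^{n-1}qF_n(q)}{2}. \]
From the identities
\[ L_n(x) - xF_n(x) = 2F_{n-1}(x), \quad L_n(x) + xF_n(x) = 2F_{n+1}(x) \quad\text{and}\quad (-1)^{n-1}F_n(x) = F_{-n}(x), \]
we get
\[ (p_1, p_2, p_3) = \delta'\bigl(F_n(q) + F_{n-1}(q),\ F_{-n}(q),\ F_n(q) + F_{n+1}(q)\bigr) \]
if $n$ is odd, and
\[
\begin{aligned}
(p_1, p_2, p_3) &= \delta'\bigl(F_n(q) + F_{n+1}(q),\ F_{-n}(q),\ F_n(q) + F_{n-1}(q)\bigr) \\
&= \delta'\bigl(F_{-n}(-q) + F_{-n-1}(-q),\ F_n(-q),\ F_{-n}(-q) + F_{-n+1}(-q)\bigr)
\end{aligned}
\]
if $n$ is even. In either case, this is in the form of the first line of (2.3).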

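Next suppose that $q = \pm 1$. Since $\frac{r-2+\sqrt{d}}{2}$ has norm $-r$, we get a family of solutions
defined by
\[ \frac{s + p_2\sqrt{d}}{2} = \delta\,\frac{r-2+\sqrt{d}}{2}\left(\frac{r+\sqrt{d}}{2}\right)^{n} \tag{2.8} \]
for $\delta \in \{\pm 1\}$ and $n \in 2\mathbb{Z}$. Thus,
\[
\begin{aligned}
p_2 &= \frac{\delta}{\sqrt{d}}\left[\frac{r-2+\sqrt{d}}{2}\left(\frac{r+\sqrt{d}}{2}\right)^{n} - \frac{r-2-\sqrt{d}}{2}\left(\frac{r-\sqrt{d}}{2}\right)^{n}\right] = \delta\,\frac{(r-2)F_n(r) + L_n(r)}{2} \\
&= \delta\bigl(F_{n+1}(r) - F_n(r)\bigr) = \delta\bigl(F_{-(n+1)}(r) + F_{-n}(r)\bigr), \\
s &= \delta\left[\frac{r-2+\sqrt{d}}{2}\left(\frac{r+\sqrt{d}}{2}\right)^{n} + \frac{r-2-\sqrt{d}}{2}\left(\frac{r-\sqrt{d}}{2}\right)^{n}\right] = \delta\,\frac{(r-2)L_n(r) + dF_n(r)}{2}
\end{aligned}
\]
and
\[ p_1 + p_3 = \frac{s + 2p_2}{r} = \delta\,\frac{(r+2)F_n(r) + L_n(r)}{2} = \delta\bigl(F_{n+1}(r) + F_n(r)\bigr). \]
Combining this with $p_3 - p_1 = qp_2$, we obtain
\[ (p_1, p_2, p_3) = \delta\bigl(F_n(r),\ F_{-n}(r) + F_{-(n+1)}(r),\ F_{n+1}(r)\bigr) \]
if $q = 1$ and
\[
\begin{aligned}
(p_1, p_2, p_3) &= \delta\bigl(F_{n+1}(r),\ F_{-n}(r) + F_{-(n+1)}(r),\ F_n(r)\bigr) \\
&= \delta\bigl(F_{-n-1}(-r),\ F_{n+1}(-r) + F_n(-r),\ F_{-n}(-r)\bigr)
\end{aligned}
\]
if $q = -1$. In either case, this is in the form of the second line of (2.3).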

In the case just presented, it is not obvious that we obtain all solutions in this manner,
but we now proceed to show that this is indeed the case. Let us assume first that $4 \nmid r$, and
let $\mathfrak{a} = \frac{s + p_2\sqrt{d}}{2}\mathcal{O}$ be the $\mathcal{O}$-ideal associated to the pair $(s, p_2)$. Then $s + p_2\sqrt{d} \equiv 0 \pmod{\mathfrak{a}}$,
and by (2.6) we have $s + 2p_2 \equiv 0 \pmod{\mathfrak{a}}$. It follows from (2.5) that $p_2$ is invertible modulo
$r$, so we conclude that $\sqrt{d} \equiv 2 \pmod{\mathfrak{a}}$.

Now if $p$ is an odd prime factor of $r$, then from (2.7) we see that $\left(\frac{d}{p}\right) = 1$. Thus, $p\mathcal{O}$ splits
as a product of two prime ideals that are distinguished by the reduction of $\sqrt{d}$, i.e. there is
a unique prime ideal $\mathfrak{p} \subset \mathcal{O}$ with norm $p$ such that $\sqrt{d} \equiv 2 \pmod{\mathfrak{p}}$.

If $r$ is even then $r \equiv 2 \pmod 4$, and from (2.6) we see that $4 \mid s$. If $q$ is also even then
$d \equiv 4 \pmod{16}$, so that $s^2 - dp_2^2 \equiv 12 \pmod{16}$, in contradiction to (2.7). Hence, $q$ must
be odd and $d \equiv 8 \pmod{16}$. It follows that the conductor of $\mathcal{O}$ is odd and 2 is ramified in
$\mathbb{Q}(\sqrt{d})$, so there is anyway a unique prime ideal $\mathfrak{p} \subset \mathcal{O}$ lying above 2.

In summary, provided that $4 \nmid r$, we have shown that $r$ is co-prime to the conductor of
$\mathcal{O}$ and that the prime factors of $\mathfrak{a}$ are uniquely determined. Therefore, any solution of (2.6)
and (2.7) generates the same ideal as the solution noted above, viz. $\frac{r-2+\sqrt{d}}{2}\mathcal{O}$. Hence,
(2.8) describes all solutions.

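Next, to handle the case when $4 \mid r$ we need to modify the above argument since the
conductor of $\mathcal{O}$ is even. In this case we set
\[ d' = d/4, \quad r' = r/4, \quad s' = s/2 \quad\text{and}\quad \mathcal{O}' = \mathbb{Z}\Bigl[\tfrac{1+\sqrt{d'}}{2}\Bigr], \]
and we work over $\mathcal{O}'$ instead of $\mathcal{O}$. Then
\[ d' = 4(qr')^2 + 1 \quad\text{and}\quad (s')^2 - d'p_2^2 = -4r', \]
and if $\mathfrak{a}' = \frac{s' + p_2\sqrt{d'}}{2}\mathcal{O}'$ then $N(\mathfrak{a}') = |r'|$ and $\sqrt{d'} \equiv 1 \pmod{\mathfrak{a}'}$. Note that if $r'$ is even
then $d' \equiv 1 \pmod 8$, so that $\left(\frac{d'}{2}\right) = 1$. Hence, proceeding as above, for each prime $p \mid r'$, we
find that there is a unique prime $\mathfrak{p} \subset \mathcal{O}'$ such that $N(\mathfrak{p}) = p$ and $\sqrt{d'} \equiv 1 \pmod{\mathfrak{p}}$. Thus,
the ideal $\mathfrak{a}'$ is again uniquely determined, so (2.8) describes all solutions.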

It remains only to show that (2.7) admits no solutions if $\min(|q|, |r|) > 1$. For this we
appeal to the reduction theory of primitive ideals in quadratic orders; see, for instance, [3,
Chapters 8 and 9] for terminology and fundamental results. When $4 \nmid r$, we apply the
reduction algorithm to see that the cycle of $\mathcal{O}$ has length 1; in other words, $\mathcal{O}$ is the only
reduced principal $\mathcal{O}$-ideal. On the other hand, by [2, Prop. 9.1.8], any primitive $\mathcal{O}$-ideal of
norm less than $\sqrt{d}/2$ is reduced. Note that if $|q| \ge 2$ then
\[ |r| \le \frac{|qr|}{2} < \frac{\sqrt{(qr)^2 + 4}}{2} = \frac{\sqrt{d}}{2}. \]
Together these imply that if $|q|, |r| \ne 1$ then there is no primitive, principal $\mathcal{O}$-ideal of norm
$|r|$, so (2.7) is not solvable.

For $r$ divisible by 4, we similarly apply the reduction algorithm to $\mathcal{O}'$ and find that its
cycle consists of $\mathcal{O}'$ together with ideals of norm $|qr'|$. In this case we
have $|r'| < \frac{1}{2}\sqrt{d'}$ for every value of $q$, so there are no primitive, principal $\mathcal{O}'$-ideals of norm
$|r'|$ if $|q|, |r'| \ne 1$. □

2.1.2. Prime triples. We now return to the prime case. Clearly the third and fourth lines
of (2.3) never yield primes, and since the sum of the entries of the second line is even, the
only (positive) prime solutions that it yields are permutations of (2, 3, 5). As for the first
line, note that $F_n(x)$ is irreducible only if $|n|$ is prime [10]. If we take $p_1 < p_3$, then we may
assume that $n$ is an odd prime, $x$ is positive, and $\delta = 1$.

In particular, with $n = 3$ we get the solutions
\[ (p_1, p_2, p_3) = (x^2 + x + 1,\ x^2 + 1,\ x^3 + x^2 + 2x + 1). \]
By standard conjectures (Schinzel’s Hypothesis), we expect that these polynomials are
simultaneously prime for infinitely many values of $x > 0$, and that motivates the following
conjecture.

Conjecture 2.6. There are infinitely many $P \in \mathcal{P}_3$ with $m(P) > 1$.

In fact, it is natural to expect triples of primes to occur with probability proportional to
$(\log x)^{-3}$, so there should be a constant $c > 0$ such that
\[ \#\{P \in \mathcal{P}_3 : m(P) > 1 \text{ and } |P| \le X\} = (c + o(1))\,\frac{X^{1/7}}{(\log X)^3} \quad\text{as } X \to \infty. \]
Such a statement seems far from what can be proven with present technology, but we are
able to obtain somewhat weaker results in Section 2.3 below.
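The $n = 3$ family can be searched directly for simultaneous prime values. The sketch below is a small illustration, assuming the polynomials $(x^2+x+1,\ x^2+1,\ x^3+x^2+2x+1)$ from the first line of (2.3) with $n = 3$; the names `is_prime`, `n3_triple` and `prime_hits` are hypothetical, and trial division is only adequate for small $x$.

```python
def is_prime(m):
    """Primality by trial division; adequate for the small values used here."""
    if m < 2:
        return False
    d = 2
    while d * d <= m:
        if m % d == 0:
            return False
        d += 1
    return True


def n3_triple(x):
    """The triple (F_2+F_3, F_{-3}, F_3+F_4) evaluated at x."""
    return (x * x + x + 1, x * x + 1, x ** 3 + x * x + 2 * x + 1)


def prime_hits(bound):
    """Values 1 <= x <= bound for which all three polynomials are prime."""
    return [x for x in range(1, bound + 1)
            if all(is_prime(v) for v in n3_triple(x))]


if __name__ == "__main__":
    for x in prime_hits(30):
        print(x, n3_triple(x))
```

For $x \le 30$ this finds $x = 1, 2, 14, 24$, which correspond to the rows (3, 2, 5), (7, 5, 17), (211, 197, 2969) and (601, 577, 14449) of Table 2.1.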


2.2. Multiple quadruples. In this section we compute the systems of congruences giving
rise to multiple quadruples of edge primes, analogous to Proposition 2.4 in the case of
triples. Note first that if $(p_1, p_2, p_3) \sim (p_3, p_2, p_1) \in \mathcal{P}_3$ is a multiple triple, then clearly
$(p_0, p_1, p_2, p_3) \sim (p_0, p_3, p_2, p_1)$ and $(p_1, p_2, p_3, p_4) \sim (p_3, p_2, p_1, p_4)$ are multiple quadruples for
any suitable choice of $p_0$ or $p_4$. More interesting are the solutions giving rise to loops of
height 4 in the graph. More generally, we will be interested in pairs $P = (p_1, \ldots, p_k)$, $Q =
(q_1, \ldots, q_k) \in \mathcal{P}_k$ defining paths in $G_n$ that meet only at $n$ and $|P|n = |Q|n$, so that they
form a loop of height $k$; that is the content of the following definition.


Definition 2.7. Let $P = (p_1, \ldots, p_k)$, $Q = (q_1, \ldots, q_k) \in \mathcal{P}_k$. We say that the pair $(P, Q) \in \mathcal{P}_k^2$
is irreducible if $P \ne Q$, $P \sim Q$ and
\[ p_1 \cdots p_i \ne q_1 \cdots q_i \quad\text{for } 0 < i < k. \]

Remark 2.8. Note that $(P, Q)$ is irreducible if and only if $(Q, P)$ is irreducible, so we may
regard the pair as unordered.


Next, we observe that any equivalence P ~ Q gives rise to another equivalence, as follows. 


Lemma 2.9. Let $P \in \mathcal{P}_k$, and suppose that $P$ is equivalent to $Q = \pi.P$ for some $\pi \in S_k$.
Let $\sigma = (1\ k)(2\ k-1) \cdots (\lfloor k/2 \rfloor\ \ k+1-\lfloor k/2 \rfloor) \in S_k$ be the permutation that reverses the
order of indices, and put $\overline{P} = \sigma.P$, $\overline{Q} = \sigma.Q$. Then:

(1) $\overline{P}$ is equivalent to $\overline{Q} = \sigma\pi\sigma.\overline{P}$;

(2) $P$, $Q$, $\overline{P}$ and $\overline{Q}$ all have the same multiplicity.

Proof. Suppose that $P = (p_1, \ldots, p_k)$ and $Q = (q_1, \ldots, q_k)$. Then
\[ p_1 \cdots p_{i-1} \equiv q_1 \cdots q_{j-1} \pmod{p_i} \]
whenever $p_i = q_j$. Note that we also have $|P| = |Q|$, and cancelling the common factor of
$p_i = q_j$ yields
\[ (p_1 \cdots p_{i-1})(p_{i+1} \cdots p_k) = (q_1 \cdots q_{j-1})(q_{j+1} \cdots q_k). \]
Dividing this equality by the above congruence, we obtain
\[ p_{i+1} \cdots p_k \equiv q_{j+1} \cdots q_k \pmod{p_i}. \]
Thus, $(p_k, \ldots, p_1)$ is equivalent to $(q_k, \ldots, q_1)$, as desired.

For the second assertion, $P$ and $Q$ clearly have the same multiplicity since they are equivalent,
and likewise for $\overline{P}$ and $\overline{Q}$, so it is enough to show that $m(P) = m(\overline{P})$. But by the first
assertion, $P$ is equivalent to $Q$ if and only if $\overline{P} = \sigma.P$ is equivalent to $\overline{Q} = \sigma.Q$, so $\sigma$ defines
a bijection between the equivalence classes of $P$ and $\overline{P}$. □


Proposition 2.10. Let $(p_1, p_2, p_3, p_4) \in \mathcal{P}_4$. If the conditions listed in the middle column of
the following table are satisfied in any one case, then each of the corresponding quadruples in
the right column has multiplicity 2, with equivalence classes as indicated. Conversely, every
multiple quadruple has multiplicity 2, and every irreducible pair of multiple quadruples occurs
in the table for a unique choice of $(p_1, p_2, p_3, p_4)$.

Case I
    Conditions:   $p_4 \equiv 1 \pmod{p_1}$
                  $p_3(p_1p_2 + p_4) \equiv 1 \pmod{p_2p_4}$
                  $p_2 \equiv p_4 \pmod{p_3}$
    Classes:      $\{(p_1, p_2, p_3, p_4), (p_4, p_1, p_3, p_2)\}$
                  $\{(p_4, p_3, p_2, p_1), (p_2, p_3, p_1, p_4)\}$

Case II
    Conditions:   $p_1 < p_2$
                  $p_3(p_1p_2 + p_4) \equiv 1 \pmod{p_1p_2p_4}$
                  $p_1p_2 \equiv p_4 \pmod{p_3}$
    Classes:      $\{(p_1, p_2, p_3, p_4), (p_4, p_3, p_1, p_2)\}$
                  $\{(p_4, p_3, p_2, p_1), (p_2, p_1, p_3, p_4)\}$

Case III
    Conditions:   $p_1 < p_4$, $p_2 < p_3$
                  $(p_1 + p_4)p_2p_3 \equiv 1 \pmod{p_1p_4}$
                  $p_1 \equiv p_4 \pmod{p_2p_3}$
    Classes:      $\{(p_1, p_2, p_3, p_4), (p_4, p_2, p_3, p_1)\}$
                  $\{(p_4, p_3, p_2, p_1), (p_1, p_3, p_2, p_4)\}$

Case IV
    Conditions:   $p_1 < p_4$
                  $(p_1 + p_4)p_2p_3 \equiv 1 \pmod{p_1p_4}$
                  $p_1 \equiv p_3p_4 \pmod{p_2}$
                  $p_1p_2 \equiv p_4 \pmod{p_3}$
    Classes:      $\{(p_1, p_2, p_3, p_4), (p_4, p_3, p_2, p_1)\}$

Remarks 2.11.

(1) Note that the non-trivial permutations of $P = (p_1, p_2, p_3, p_4)$ appearing in the table
are those labelled $Q$, $\overline{P}$ and $\overline{Q}$ in Lemma 2.9; they are all distinct except in Case IV,
where we have $Q = \overline{P}$ and $\overline{Q} = P$.

(2) The proposition asserts that a given quadruple cannot appear on the right-hand side
of the table more than once, and that there are never more than two paths in $G_n$
between $n$ and $p_1p_2p_3p_4n$. However, it can happen that different permutations of
$(p_1, p_2, p_3, p_4)$ arise from different cases in the table or from the same case multiple
times; for instance, eight permutations of (2, 3, 11, 13) give rise to quadruples with
multiplicity 2, and they arise once in Case I and twice in Case IV. This is not a
contradiction because the sets $N(P)$ and $N(P')$ are disjoint for inequivalent permutations
$P$ and $P'$, and thus the corresponding paths cannot emerge together from the
same node.

(3) We will see below that solutions exist in each of the Cases I-IV.


Proof. Let $P = (p_1, p_2, p_3, p_4)$, $Q = (q_1, q_2, q_3, q_4)$, and suppose that $(P, Q) \in \mathcal{P}_4^2$ form an
irreducible pair. Then there is a non-trivial permutation $\pi \in S_4$ such that $P$ is equivalent to
$Q = \pi.P$. Since $(P, Q)$ is irreducible, $\pi$ cannot stabilize any of the sets $\{1\}$, $\{1, 2\}$ or $\{1, 2, 3\}$.
Moreover, by Lemma 2.9, the solutions for a given $\pi$ are in one-to-one correspondence with
those for $\pi^{-1}$, $\sigma\pi\sigma$ and $\sigma\pi^{-1}\sigma$, where $\sigma = (14)(23)$, so we may group those permutations
together into classes and consider the solutions for only one permutation from each class.
With some straightforward computations in $S_4$, we find that there are seven classes:
\[
\begin{gathered}
\{(1234), (1432)\},\ \{(1243), (1342)\},\ \{(13)(24)\}, \\
\{(124), (142), (134), (143)\},\ \{(1324), (1423)\},\ \{(14)\},\ \{(14)(23)\}.
\end{gathered}
\tag{2.9}
\]
The first three turn out to yield no solutions, while the last four correspond to the four cases
in the table. We consider each class in turn and take $\pi$ to be the first element listed in each
case.
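The grouping of permutations into seven classes can be reproduced by a short enumeration. This is a sketch under the stated setup only: permutations of $\{1,2,3,4\}$ are stored 0-indexed as tuples of images, those stabilizing an initial segment $\{1\}$, $\{1,2\}$ or $\{1,2,3\}$ are discarded, and the rest are grouped under inversion and conjugation by the reversal $\sigma = (14)(23)$. All helper names are hypothetical.

```python
from itertools import permutations


def compose(f, g):
    """Composition (f o g)(i) = f(g(i)) of permutations stored as tuples of images."""
    return tuple(f[g[i]] for i in range(len(f)))


def inverse(f):
    inv = [0] * len(f)
    for i, v in enumerate(f):
        inv[v] = i
    return tuple(inv)


def stabilizes_prefix(f):
    """True if f maps some proper initial segment {0, ..., m-1} onto itself."""
    return any({f[i] for i in range(m)} == set(range(m)) for m in range(1, len(f)))


def classes(k=4):
    sigma = tuple(reversed(range(k)))  # the index-reversing permutation
    candidates = [f for f in permutations(range(k)) if not stabilizes_prefix(f)]
    found = set()
    for f in candidates:
        g = inverse(f)
        orbit = {f, g,
                 compose(sigma, compose(f, sigma)),
                 compose(sigma, compose(g, sigma))}
        found.add(frozenset(orbit))
    return candidates, found


if __name__ == "__main__":
    candidates, found = classes()
    print(len(candidates), sorted(len(c) for c in found))
```

The enumeration finds 13 admissible permutations in 7 classes, of sizes 1, 1, 1, 2, 2, 2 and 4.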


$\pi = (1234)$: Then $(p_1, p_2, p_3, p_4) = (q_2, q_3, q_4, q_1)$, and we have
\[
\begin{aligned}
1 &\equiv q_1 = p_4 \pmod{p_1} \\
p_1 &\equiv q_1q_2 = p_1p_4 \pmod{p_2} \implies 1 \equiv p_4 \pmod{p_2} \\
p_1p_2 &\equiv q_1q_2q_3 = p_1p_2p_4 \pmod{p_3} \implies 1 \equiv p_4 \pmod{p_3} \\
p_1p_2p_3 &\equiv 1 \pmod{p_4}.
\end{aligned}
\]
Thus, we have both $p_4 \equiv 1 \pmod{p_1p_2p_3}$ and $p_1p_2p_3 \equiv 1 \pmod{p_4}$, which is impossible.
$\pi = (1243)$: Then $(p_1, p_2, p_3, p_4) = (q_2, q_4, q_1, q_3)$, and we have
\[
\begin{aligned}
1 &\equiv q_1 = p_3 \pmod{p_1} \\
p_1 &\equiv q_1q_2q_3 = p_1p_3p_4 \pmod{p_2} \implies 1 \equiv p_3p_4 \pmod{p_2} \\
p_1p_2 &\equiv 1 \pmod{p_3} \\
p_1p_2p_3 &\equiv q_1q_2 = p_1p_3 \pmod{p_4} \implies p_2 \equiv 1 \pmod{p_4}.
\end{aligned}
\]
Thus, $p_1$ divides $1 - p_3$, and $p_2 \equiv \frac{1-p_3}{p_1} \pmod{p_3}$. Note that applying the permutation (14)(23)
to the indices leaves the system unchanged, so we may assume without loss of generality that
$p_2 < p_3$. Therefore, $p_2 = p_3 + \frac{1-p_3}{p_1}$, whence
\[ p_3 \equiv \frac{p_3 - 1}{p_1} \pmod{p_2} \implies p_1p_3 \equiv p_3 - 1 \pmod{p_2}. \]
Since we also have $p_3p_4 \equiv 1 \pmod{p_2}$, this implies that $p_1 + p_4 \equiv 1 \pmod{p_2}$.
Now, if $p_2 < 5$ then we must have $p_2 = 3$, $p_4 = 2$, so $p_1 > 3$ and $p_2 \le 2p_1 - 3$. On the
other hand, if $p_2 \ge 5$ then $p_2 \ge 1 + 2p_4$, and
\[ p_2 \le p_1 + p_4 - 1 \le p_1 + \frac{p_2 - 1}{2} - 1 \implies p_2 \le 2p_1 - 3. \]
Since $p_2 = p_3 + \frac{1-p_3}{p_1}$, this implies that $p_3 < \frac{p_1(2p_1-3)}{p_1-1} < 2p_1$. Hence, $p_3 = p_1 + 1$, so that
$p_1 = 2$, $p_3 = 3$. But then $p_2 \le 2p_1 - 3 = 1$, which is impossible.

$\pi = (13)(24)$: Then $(p_1, p_2, p_3, p_4) = (q_3, q_4, q_1, q_2)$ and we have
\[
\begin{aligned}
1 &\equiv q_1q_2 = p_3p_4 \pmod{p_1} \\
p_1 &\equiv q_1q_2q_3 = p_1p_3p_4 \pmod{p_2} \implies 1 \equiv p_3p_4 \pmod{p_2} \\
p_1p_2 &\equiv 1 \pmod{p_3} \\
p_1p_2p_3 &\equiv q_1 = p_3 \pmod{p_4} \implies p_1p_2 \equiv 1 \pmod{p_4}.
\end{aligned}
\]
Thus, we have both $p_3p_4 \equiv 1 \pmod{p_1p_2}$ and $p_1p_2 \equiv 1 \pmod{p_3p_4}$, which is impossible.

$\pi = (124)$: Then $(p_1, p_2, p_3, p_4) = (q_2, q_4, q_3, q_1)$, and we have
\[
\begin{aligned}
1 &\equiv q_1 = p_4 \pmod{p_1} \\
p_1 &\equiv q_1q_2q_3 = p_1p_3p_4 \pmod{p_2} \implies 1 \equiv p_3p_4 \pmod{p_2} \\
p_1p_2 &\equiv q_1q_2 = p_1p_4 \pmod{p_3} \implies p_2 \equiv p_4 \pmod{p_3} \\
p_1p_2p_3 &\equiv 1 \pmod{p_4},
\end{aligned}
\]
which is equivalent to the set of conditions in Case I. The equivalence classes in the right-hand
column are $\{P, \pi.P\}$, $\{\sigma.P, \sigma\pi.P\}$.

$\pi = (1324)$: Then $(p_1, p_2, p_3, p_4) = (q_3, q_4, q_2, q_1)$, and we have
\[
\begin{aligned}
1 &\equiv q_1q_2 = p_3p_4 \pmod{p_1} \\
p_1 &\equiv q_1q_2q_3 = p_1p_3p_4 \pmod{p_2} \implies 1 \equiv p_3p_4 \pmod{p_2} \\
p_1p_2 &\equiv q_1 = p_4 \pmod{p_3} \\
p_1p_2p_3 &\equiv 1 \pmod{p_4},
\end{aligned}
\]
which is equivalent to the system of congruences in Case II. In this case, the system is
invariant under the action of $(12) = \sigma\pi$, but the normalization condition $p_1 < p_2$ ensures
that each set of solutions $\{P, \pi.P\}$, $\{\sigma.P, \sigma\pi.P\}$ is counted only once.

$\pi = (14)$: Then $(p_1, p_2, p_3, p_4) = (q_4, q_2, q_3, q_1)$, and we have
\[
\begin{aligned}
1 &\equiv q_1q_2q_3 = p_2p_3p_4 \pmod{p_1} \\
p_1 &\equiv q_1 = p_4 \pmod{p_2} \\
p_1p_2 &\equiv q_1q_2 = p_2p_4 \pmod{p_3} \implies p_1 \equiv p_4 \pmod{p_3} \\
p_1p_2p_3 &\equiv 1 \pmod{p_4},
\end{aligned}
\]
which is equivalent to the system of congruences in Case III. In this case, the system is
invariant under both $\sigma$ and $\pi$, but the normalization conditions $p_1 < p_4$ and $p_2 < p_3$ ensure
that each set of solutions $\{P, \pi.P\}$, $\{\sigma.P, \sigma\pi.P\}$ is counted only once.
$\pi = (14)(23)$: Then $(p_1, p_2, p_3, p_4) = (q_4, q_3, q_2, q_1)$, and we have
\[
\begin{aligned}
1 &\equiv q_1q_2q_3 = p_2p_3p_4 \pmod{p_1} \\
p_1 &\equiv q_1q_2 = p_3p_4 \pmod{p_2} \\
p_1p_2 &\equiv q_1 = p_4 \pmod{p_3} \\
p_1p_2p_3 &\equiv 1 \pmod{p_4},
\end{aligned}
\]
which is equivalent to the system of congruences in Case IV. In this case, we have $\pi = \sigma$, so
we get only one equivalence class of solutions. The system is also invariant under $\pi = \sigma$, but
the normalization condition $p_1 < p_4$ ensures that each set of solutions $\{P, \pi.P\}$ is counted
only once.


Conversely, it is easy to see that the logic is reversible in the last four cases considered,
so any $(p_1, p_2, p_3, p_4) \in \mathcal{P}_4$ satisfying one of the given sets of conditions gives rise to multiple
quadruples as indicated.

It remains to prove the assertion that the multiplicity is 2 in each case. Suppose that $P$ is
equivalent to both $Q = \pi.P$ and $Q' = \pi'.P$ for some non-trivial $\pi \ne \pi'$. Then $Q$ is equivalent
to $Q' = \pi'\pi^{-1}.Q$. Hence, $\pi$, $\pi'$ and $\pi'\pi^{-1}$ are all contained in the union
\[ \{(124), (142), (134), (143), (1324), (1423), (14), (14)(23), (13), (24)\} \]
of the last four classes in (2.9), together with the permutations giving rise to multiple triples
$(p_1, p_2, p_3)$ or $(p_2, p_3, p_4)$. Note that we are free to replace $P, Q, Q'$ by $\sigma.P, \sigma.Q, \sigma.Q'$ or to
permute them arbitrarily, which is to say that we can replace $(\pi, \pi')$ by any of the pairs
\[ (\pi, \pi'),\ (\pi', \pi),\ (\pi^{-1}, \pi'\pi^{-1}),\ (\pi'\pi^{-1}, \pi^{-1}),\ (\pi'^{-1}, \pi\pi'^{-1}) \text{ or } (\pi\pi'^{-1}, \pi'^{-1}), \]
or their conjugates by $\sigma$. Going through all possibilities, we find that we may assume that
\[ (\pi, \pi') \in \{((124), (142)),\ ((13), (124)),\ ((13), (134))\}. \]
We consider these three cases in turn.


$\pi = (124)$, $\pi' = (142)$: Recall that $\pi = (124)$ leads to the system in Case I. For $\pi' = (142)$
and $Q' = (q_1', q_2', q_3', q_4')$, we have $(p_1, p_2, p_3, p_4) = (q_4', q_1', q_3', q_2')$, so that $p_1 \equiv 1 \pmod{p_2}$ and
$p_1p_2p_3 \equiv q_1' = p_2 \pmod{p_4}$. Hence, $p_2 \equiv 1 \pmod{p_4}$, and we also have $p_4 \equiv 1 \pmod{p_1}$, so
that $p_4 < p_2 < p_1 < p_4$, which is impossible.


$\pi = (13)$, $\pi' = (124)$: Then $(p_1, p_2, p_3, p_4)$ satisfies the system in Case I as well as (2.2).
Thus we have
\[
\begin{aligned}
1 &\equiv p_4 \equiv p_2p_3p_4 \pmod{p_1} \\
1 &\equiv p_3p_4 \equiv p_1p_4 \pmod{p_2} \\
1 &\equiv p_1p_2 \equiv p_1p_4 \pmod{p_3},
\end{aligned}
\]
so that $p_4(p_1 + p_2p_3) \equiv 1 \pmod{p_1p_2p_3}$. Also, $p_1p_2p_3 \equiv 1 \pmod{p_4}$, so that $p_4 = \frac{p_1p_2p_3 - 1}{t}$ for
some $t \in (0, p_1p_2p_3) \cap \mathbb{Z}$. Substituting for $p_4$, we have $t \equiv -p_1 - p_2p_3 \pmod{p_1p_2p_3}$, whence
$t = p_1p_2p_3 - p_1 - p_2p_3$. Thus,
\[ p_4(p_1p_2p_3 - p_1 - p_2p_3) = p_1p_2p_3 - 1, \]
which implies
\[ p_4p_1 - 1 = \bigl((p_4 - 1)p_1 - p_4\bigr)p_2p_3 \ge 6\bigl[(p_4 - 1)p_1 - p_4\bigr]. \]
Hence $p_1 \le \frac{6p_4 - 1}{5p_4 - 6}$. If $p_4 \ge 3$ then this gives $p_1 < 2$, while if $p_4 = 2$ then $2 < p_1 \le \frac{11}{4} < 3$, but both
of these are impossible.


$\pi = (13)$, $\pi' = (134)$: We have $Q' = (q_1', q_2', q_3', q_4') = (p_4, p_2, p_1, p_3)$, and in view of (2.2)
we get
\[
\begin{aligned}
1 &\equiv q_1'q_2' = p_2p_4 \pmod{p_1} \implies p_4 \equiv p_2^{-1} \equiv p_3 \pmod{p_1} \\
p_1 &\equiv q_1' = p_4 \pmod{p_2} \implies p_4 \equiv p_3 \pmod{p_2} \\
p_1p_2 &\equiv q_1'q_2'q_3' = p_1p_2p_4 \pmod{p_3} \implies p_4 \equiv 1 \equiv p_1p_2 \pmod{p_3} \\
p_1p_2p_3 &\equiv 1 \pmod{p_4}.
\end{aligned}
\]
Hence $p_4 \equiv p_3 + p_1p_2 \pmod{p_1p_2p_3}$ and $p_4 < p_1p_2p_3$, so that $p_4 = p_3 + p_1p_2$. By parity
considerations we see that at least one of $p_1$, $p_2$ and $p_3$ must be 2, and it follows from
Theorem 2.5 that $(p_1, p_2, p_3)$ is a permutation of (2, 3, 5). Therefore, $p_1p_2p_3 - 1 = 29$ is
prime, so that $p_4 = p_1p_2p_3 - 1 > p_3 + p_1p_2$, which is a contradiction.


Finally, suppose that a quadruple $P$ occurs in the table for two different choices of
$(p_1, p_2, p_3, p_4)$. Then, by the above argument, in both instances $P$ must be related to the
other element of its equivalence class by the same permutation. Thus, either $P$ appears
once in each equivalence class in Case II or Case III, or twice in Case IV. However, the
normalization conditions rule out all of these possibilities. □


Table 2.2 shows the first several solutions to the conditions in Proposition 2.10 ordered
by modulus.

2.3. Multiple k-tuples for large k. The alert reader will note that the congruence
constraints in Cases II and III of Proposition 2.10 are nothing but (2.2) with $(p_1, p_2, p_3)$ replaced
by $(p_1p_2, p_3, p_4)$ or $(p_1, p_2p_3, p_4)$; in particular, the solutions are parametrized by Theorem 2.5.
This turns out to be a general phenomenon, in the sense that the system of congruences
arising from a given element of $S_k$ can be embedded in a system for any $K > k$ by grouping the
primes into products, as the following lemma shows.


Lemma 2.12. For $i = 1, \ldots, k$, let $P_i > 1$ be an integer with prime factors $p_{ij}$ for $j =
1, \ldots, r_i$, and assume that $P_1 \cdots P_k$ is squarefree. Put $K = r_1 + \ldots + r_k$, and set
\[ P = (p_{11}, \ldots, p_{1r_1}, \ldots, p_{k1}, \ldots, p_{kr_k}) \in \mathcal{P}_K. \]
Suppose that $\pi \in S_k$ is a non-trivial permutation such that
\[ P_1 \cdots P_{i-1} \equiv Q_1 \cdots Q_{\pi(i)-1} \pmod{P_i} \quad\text{for } i = 1, \ldots, k, \]
where $(Q_1, \ldots, Q_k) = \pi.(P_1, \ldots, P_k)$. Then there is a non-trivial permutation $\Pi \in S_K$ such
that $P \sim \Pi.P$. Further, the pair $(P, \Pi.P)$ is irreducible if and only if
\[ P_1 \cdots P_i \ne Q_1 \cdots Q_i \quad\text{for } 0 < i < k. \]

Remark 2.13. Note that the order of the prime factors of $P_i$ is not specified, so each solution
$(P_1, \ldots, P_k)$ gives rise to $\prod_{i=1}^{k} r_i!$ multiple $K$-tuples.
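As a small illustration of the lemma, take the integer triple $(P_1, P_2, P_3) = (13, 10, 43)$, which satisfies (2.2) with $\pi = (13)$. Splitting $P_2 = 2 \cdot 5$ into its prime factors gives quadruples that can be checked for equivalence through their residues $a(P)$. The sketch below assumes this example; the names `residue` and `lift` are hypothetical, blocks are permuted with the convention $P_i = Q_{\pi(i)}$ using 0-indexed images, and the modular inverse via `pow` assumes Python 3.8 or later.

```python
def residue(P):
    """Return a(P) in [0, |P|): the residue class of N(P) modulo |P|."""
    a, modulus = 0, 1
    for p in P:
        # Solve modulus*(a + modulus*t) + 1 = 0 (mod p) for t.
        t = (-(modulus * a + 1) * pow(modulus * modulus, -1, p)) % p
        a += modulus * t
        modulus *= p
    return a


def lift(blocks, pi):
    """Apply pi to whole blocks of primes and flatten.

    blocks[i] holds the prime factors of P_{i+1}; pi[i] is the 0-indexed image
    of i, so that block i of the input becomes block pi[i] of the output.
    """
    permuted = [None] * len(blocks)
    for i, block in enumerate(blocks):
        permuted[pi[i]] = block

    def flatten(bs):
        return tuple(p for b in bs for p in b)

    return flatten(blocks), flatten(permuted)


if __name__ == "__main__":
    for middle in [(2, 5), (5, 2)]:
        P, PiP = lift([(13,), middle, (43,)], (2, 1, 0))
        print(P, PiP, residue(P), residue(PiP))
```

Both orderings of the factors of $P_2$ give a multiple quadruple, in line with Remark 2.13, and the two residues obtained are 5589 and 3353.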

(p_1,p_2,p_3,p_4)    |P|           a(P)                     case
(2,5,7,3)            210           107, 149                 II
(3,13,2,7)           546           181, 251                 I
(3,2,11,13)          858           467, 779                 I
(11,3,2,13)          858           571                      IV
(13,3,2,11)          858           857                      IV
(3,19,11,2)          1254          1127                     IV
(7,3,2,41)           1722          1721                     IV
(41,3,2,7)           1722          1147                     IV
(41,7,2,3)           1722          491                      IV
(41,7,3,2)           1722          1639                     IV
(5,29,2,17)          4930          3909                     IV
(13,2,5,43)          5590          3353, 5589               III
(2,3,31,37)          6882          1183, 5771               II
(3,7,17,89)          31773         22427, 26966             II
(103,31,2,5)         31930         5149                     IV
(7,23,2,107)         34454         29959                    IV
(3,17,31,79)         124899        81764, 81922             I
(41,17,2,199)        277406        32635                    IV
(5,53,37,43)         421615        39559, 173203            II
(73,5,13,593)        2813785       1125513, 1861426         III
(449,67,2,191)       11491706      6517683                  IV
(241,2,113,3631)     197766046     183764909, 42003407      III
(2,3541,997,103)     727257662     714062125                IV
(23,367,401,421)     1425018061    418499259, 1226476565    II

Table 2.2. Multiple quadruples of small modulus


Proof. The main idea is to apply $\pi$ to the blocks of indices of length $r_i$. More formally, for
$i = 1,\ldots,k+1$, let $s_i = r_1 + \cdots + r_{i-1}$ and $t_i = r_{\pi^{-1}(1)} + \cdots + r_{\pi^{-1}(i-1)}$. Note that $s_i + j$ is
the index of the $j$th prime factor of $P_i$ in $P$. Given $l \in \{1,\ldots,K\}$ we define $\Pi(l) = t_{\pi(i)} + j$,
where $i \in \{1,\ldots,k\}$ and $j \in \{1,\ldots,r_i\}$ are the unique indices for which $l = s_i + j$.


Note that $\Pi(l) = t_{\pi(i)} + j \le t_{\pi(i)+1} \le K$, so $\Pi$ maps $\{1,\ldots,K\}$ to itself. To see
that it defines an element of $S_K$, it suffices to show that it is surjective. To that end, given
any $l \in \{1,\ldots,K\}$, choose $i$ to be the largest positive integer such that $t_i < l$, and set
$j = l - t_i > 0$. Then $t_i + r_{\pi^{-1}(i)} = t_{i+1} \ge l$, so $j \le r_{\pi^{-1}(i)}$. Hence $l = \Pi(s_{\pi^{-1}(i)} + j)$, as
required.

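We must show that $P$ is equivalent to $\Pi.P$. Let $u_1,\ldots,u_K$ denote the entries of $P$ and
$v_1,\ldots,v_K$ the entries of $\Pi.P$. Given $l \in \{1,\ldots,K\}$, let $l = s_i + j$ for $i \in \{1,\ldots,k\}$ and
$j \in \{1,\ldots,r_i\}$. Then

$$u_1\cdots u_{l-1} = \Bigl(\prod_{i'=1}^{i-1} P_{i'}\Bigr)\Bigl(\prod_{j'=1}^{j-1} u_{s_i+j'}\Bigr).$$

Since $u_l \mid P_i$ and $u_{s_i+j'} = v_{t_{\pi(i)}+j'}$ for $j' = 1,\ldots,j$, this is congruent modulo $u_l = v_{\Pi(l)}$ to

$$\Bigl(\prod_{i'=1}^{\pi(i)-1} Q_{i'}\Bigr)\Bigl(\prod_{j'=1}^{j-1} v_{t_{\pi(i)}+j'}\Bigr) = v_1\cdots v_{\Pi(l)-1}.$$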

Since $l$ was arbitrary, $P \sim \Pi.P$.

As for the final claim, if $(P,\Pi.P)$ is not irreducible then $u_1\cdots u_l = v_1\cdots v_l$ for some
$l \in (0,K)\cap\mathbb{Z}$. If $l \le r_1$ then by definition we have $\Pi(l) = t_{\pi(1)} + l$. Since $u_l$ divides
$v_1\cdots v_l$, we also have $\Pi(l) \le l$. Thus $t_{\pi(1)} = 0$, which implies $\pi(1) = 1$ and $P_1 = Q_1$. Hence
we may assume that $l > r_1$.

Let $i < k$ be the largest positive integer such that $l \ge s_{i+1}$, and $i' < k$ the largest
non-negative integer such that $l \ge t_{i'+1}$. It follows that $u_1\cdots u_l$ is divisible by $P_1,\ldots,P_i$ but
not by $P_j$ for any $j > i$. Similarly, $u_1\cdots u_l$ is divisible by $Q_1,\ldots,Q_{i'}$, but not by $Q_j$ for any
$j > i'$. Since $P_j = Q_{\pi(j)}$ for every $j$, it follows that $\pi$ is a bijection between $\{1,\ldots,i\}$ and
$\{1,\ldots,i'\}$; hence $i' = i$ and $\pi$ stabilizes $\{1,\ldots,i\}$. In particular, $P_1\cdots P_i = Q_1\cdots Q_i$.

Conversely, suppose that $P_1\cdots P_i = Q_1\cdots Q_i$ for some $i \in (0,k)\cap\mathbb{Z}$. We have $P_1\cdots P_i = u_1\cdots u_l$
and $Q_1\cdots Q_i = v_1\cdots v_{l'}$ for some $l, l' \in (0,K)\cap\mathbb{Z}$. By unique factorization, $l = l'$,
and thus $(P,\Pi.P)$ is not irreducible. □
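
To make the construction in the proof concrete, here is a minimal Python sketch that builds $\Pi$ from $\pi$ and the block sizes $r_i$, and then checks the congruences $u_1\cdots u_{l-1} \equiv v_1\cdots v_{\Pi(l)-1} \pmod{u_l}$ used above. The function names and the 0-based indexing are our own conventions, not taken from the paper. The example blocks (13, 2·5, 43) with $\pi = (1\,3)$ have product 5590, the modulus of the row (13,2,5,43) in Table 2.2.

```python
from math import prod

def lift_permutation(pi, r):
    """Lift pi in S_k (0-based list, pi[i] = image of i) to Pi in S_K,
    moving the block of r[i] consecutive indices belonging to P_i as a unit."""
    k = len(r)
    inv = [0] * k
    for i, image in enumerate(pi):
        inv[image] = i
    # s[i]: offset of block i inside P; t[m]: offset of block m inside Pi.P
    s = [sum(r[:i]) for i in range(k)]
    t = [sum(r[inv[m]] for m in range(pos)) for pos in range(k)]
    Pi = [0] * sum(r)
    for i in range(k):
        for j in range(r[i]):
            Pi[s[i] + j] = t[pi[i]] + j
    return Pi

def apply_perm(Pi, u):
    """Return v = Pi.u, defined by u[l] = v[Pi[l]]."""
    v = [0] * len(u)
    for l, image in enumerate(Pi):
        v[image] = u[l]
    return v

def is_equivalent(u, v, Pi):
    """Check u_1...u_{l-1} = v_1...v_{Pi(l)-1} (mod u_l) for every index l."""
    return all(
        (prod(u[:l]) - prod(v[:Pi[l]])) % u[l] == 0
        for l in range(len(u))
    )

blocks = [[13], [2, 5], [43]]   # P_1 = 13, P_2 = 10, P_3 = 43
pi = [2, 1, 0]                  # the transposition (1 3), written 0-based
u = [p for block in blocks for p in block]
Pi = lift_permutation(pi, [len(b) for b in blocks])
v = apply_perm(Pi, u)
print(Pi, v, is_equivalent(u, v, Pi))
```

The two blocks of length 1 are swapped while the block (2, 5) stays in place, which is exactly the "apply $\pi$ to blocks" idea of the proof.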


In the following we let $T_r$ denote the set of squarefree integers with at most $r$ prime factors,
and $T_\infty = \bigcup_{r=0}^{\infty} T_r$ the set of all squarefree integers.


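Lemma 2.14. Let $f(x) = (x^2+x+1)(x^2+1)(x^3+x^2+2x+1)$ and $g(x) = x(x^2-x+1)(x^2+1)$.
Then, for any $q \in \mathbb{Z}_{>0}$ and all sufficiently large $X > 0$ (with the meaning of "sufficiently
large" possibly depending on $q$), we have

(1) $\#\{x \in \mathbb{Z}\cap[1,X] : (f(x),q) = 1 \text{ and } f(x) \in T_\infty\} \gg_q X$;

(2) $\#\{x \in \mathbb{Z}\cap[1,X] : (f(x),q) = 1 \text{ and } f(x) \in T_{13}\} \gg_q X/(\log X)^3$;

(3) $\#\{x \in \mathbb{Z}\cap[1,X] : (g(x),q^2) = q \text{ and } q^{-1}g(x) \in T_\infty\} \gg_q X$;

(4) $\#\{x \in \mathbb{Z}\cap[1,X] : (g(x),q^2) = q \text{ and } q^{-1}g(x) \in T_{12}\} \gg_q X/(\log X)^3$.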


Proof. Let $h \in \mathbb{Z}[x]$ be a squarefree polynomial with $k$ irreducible factors and content 1, and
suppose that there exists $a \in \mathbb{Z}$ such that $p \nmid h(a)$ for every prime $p \le \deg h$. Then it was
shown in [2] that if every irreducible factor of $h$ has degree at most 3 then there are positive
numbers $c = c(h)$ and $r = r(k, \deg h)$ such that

$$\#\{x \in \mathbb{Z}\cap[1,X] : h(x) \in T_\infty\} = (c + o(1))X \quad\text{as } X \to \infty$$

and

$$\#\{x \in \mathbb{Z}\cap[1,X] : h(x) \in T_r\} \gg \frac{X}{(\log X)^k}.$$

Further, for $k = 3$ and $\deg h = 7$ we may take $r = 13$. Thus, (1) and (2) follow on applying
these results to $h(x) = f(qx)$.

For (3) and (4) we set $Q = \operatorname{lcm}(q,2)$ and take $h(x) = Q^{-1}g(Q + Q^2x)$. Then $h \in \mathbb{Z}[x]$,
and if $a \in \mathbb{Z}$ is such that

Qa = -1 (mod ^), 

then $(h(a), 30) = 1$. From [2] we find that $r = 11$ is admissible for $h$, from which (3) and (4)
follow. □


Theorem 2.15.

(1) For any $q \in \mathbb{Z}_{>0}$, there are infinitely many positive integers $k$ such that $\mathcal{P}_k^2$ contains
an irreducible pair of modulus co-prime to $q$.

(2) There is a positive integer $k \le 13$ such that, for any $q \in \mathbb{Z}_{>0}$, $\mathcal{P}_k^2$ contains infinitely
many irreducible pairs of modulus co-prime to $q$.

(3) For any squarefree $q \in \mathbb{Z}_{>0}$, there are infinitely many positive integers $k$ such that
$\mathcal{P}_k^2$ contains an irreducible pair of modulus divisible by $q$, and the least such $k$ is at
most $\omega(q) + 12$.


Remark 2.16. Combining (1) and (2) with Lemma 2.2 and the Chinese remainder theorem, we see
that if $a, q, k \in \mathbb{Z}_{>0}$, then for a positive proportion of the numbers $n \equiv a \pmod{q}$,
$G_n$ contains both a loop of height $\le 13$ and a loop of height $\ge k$. If $(a,q) = 1$ then
the same assertion holds with $n$ restricted to primes.

Similarly, by (3), for any squarefree $q \in \mathbb{Z}_{>0}$ there is a prime $n$ such that $G_n$ contains
a loop of height $\le \omega(q) + 12$ that has every prime factor of $q$ as an edge. In particular,
every prime occurs as an edge of a loop in some $G_n$.


Proof. Let $f(x)$ be as in Lemma 2.14. Suppose that $f(x)$ is squarefree for some $x \in \mathbb{Z}_{>0}$,
and put

$$(P_1,P_2,P_3) = (x^2+x+1,\; x^2+1,\; x^3+x^2+2x+1).$$

Then the $P_i$ are squarefree and pairwise co-prime. By Theorem 2.5, $(P_1,P_2,P_3)$ satisfies
(2.2), and applying Lemma 2.12 with $\pi = (1\,3)$, we obtain an irreducible pair $(P,\Pi.P) \in \mathcal{P}_K^2$,
where $|P| = f(x)$ and $K = \omega(f(x))$. (Recall that $\omega(n)$ denotes the number of distinct prime
factors of $n$.)
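
As a sanity check on this triple, the following sketch verifies numerically that $(P_1,P_2,P_3) = (x^2+x+1,\ x^2+1,\ x^3+x^2+2x+1)$ satisfies the three congruences that Lemma 2.12 requires for $k = 3$ and $\pi = (1\,3)$, namely $P_2P_3 \equiv 1 \pmod{P_1}$, $P_1 \equiv P_3 \pmod{P_2}$ and $P_1P_2 \equiv 1 \pmod{P_3}$. The helper names are ours, and the range of $x$ tested is arbitrary.

```python
def triple(x):
    """The triple (P1, P2, P3) evaluated at the integer x."""
    return (x * x + x + 1, x * x + 1, x**3 + x * x + 2 * x + 1)

def satisfies_congruences(P1, P2, P3):
    """Congruences of Lemma 2.12 for k = 3 and pi = (1 3)."""
    return (
        (P2 * P3 - 1) % P1 == 0
        and (P1 - P3) % P2 == 0
        and (P1 * P2 - 1) % P3 == 0
    )

# The congruences are polynomial identities, so they hold for every x tested.
assert all(satisfies_congruences(*triple(x)) for x in range(1, 1000))
print(triple(1), triple(3))
```

At $x = 1$ the triple is (3, 2, 5), and at $x = 3$ it is (13, 10, 43), whose product 5590 appears as a modulus in Table 2.2.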

Now, to prove (1), we construct a sequence of positive integers $x_i$ as follows. Assume that
$x_1,\ldots,x_{i-1}$ have been chosen, and set

$$r = \begin{cases} 0 & \text{if } i = 1,\\ \omega(f(x_{i-1})) & \text{if } i > 1. \end{cases}$$

It was shown by Halberstam [8] that, for any irreducible polynomial $h \in \mathbb{Z}[x]$,
$\frac{\omega(h(x)) - \log\log x}{\sqrt{\log\log x}}$
has a Gaussian distribution, as in the Erdős-Kac theorem. Taking $h$ to be one of the
irreducible factors of $f$, we have in particular that

$$\#\{x \in \mathbb{Z}\cap[1,X] : f(x) \in T_r\} \le \#\{x \in \mathbb{Z}\cap[1,X] : h(x) \in T_r\} = o(X) \quad\text{as } X \to \infty.$$

Thus, by part (1) of Lemma 2.14, we may choose $x_i \in \mathbb{Z}_{>0}$ such that $(f(x_i),q) = 1$, $f(x_i)$ is
squarefree and $\omega(f(x_i)) > r$.

Hence, for the sequence of $x_i$ thus constructed, $\omega(f(x_i))$ is strictly increasing. By the
above, for each $i$, $\mathcal{P}_K^2$ with $K = \omega(f(x_i))$ contains an irreducible pair of modulus $f(x_i)$, and (1) follows.

Turning to (2), suppose that there is no such $k$. Then for each $k = 1,\ldots,13$, there exists
$q_k \in \mathbb{Z}_{>0}$ such that $\mathcal{P}_k^2$ contains at most finitely many irreducible pairs of modulus co-prime
to $q_k$, and replacing $q_k$ by a suitable multiple if necessary, we may assume that there are no
such pairs. Applying part (2) of Lemma 2.14 with $q = q_1\cdots q_{13}$, there exists $x \in \mathbb{Z}_{>0}$ such
that $f(x) \in T_{13}$ and $(f(x),q) = 1$. By the above construction, we obtain an irreducible pair
$(P,\Pi.P) \in \mathcal{P}_K^2$ of modulus co-prime to $q$, where $K = \omega(f(x)) \le 13$. This is a contradiction,
and (2) follows.

Finally, (3) is proved in much the same way using the triple

$$(P_1,P_2,P_3) = (x,\; x^2-x+1,\; x^2+1),$$

corresponding to the second line of Theorem 2.5 with $n = 2$, and $g(x)$ in place of $f(x)$; we
omit the details. □


2.4. Multiple $k$-tuples with small modulus. One could continue as in Propositions 2.4
and 2.10 to classify the multiple $k$-tuples for $k = 5, 6, \ldots$, but as the proof of Proposition 2.10
shows, this quickly becomes cumbersome. A more practical means of identifying relatively
dense arithmetic progressions $N(P)$ of nodes giving rise to loops is to do a direct search for
small values of $|P|$.

One procedure for finding all multiple $k$-tuples of a given modulus is as follows. Suppose
that $m$ is a squarefree positive integer (our candidate for $|P|$), and rewrite the system of
congruences in Definition 2.1 as

$$p_1\cdots p_{i-1} \equiv d_i \pmod{p_i}, \tag{2.10}$$

where $d_1,\ldots,d_k$ are proper divisors of $m$ satisfying

$$d_i \ne d_j \quad\text{and}\quad \min(d_i,d_j) \mid \max(d_i,d_j) \tag{2.11}$$

for all $i \ne j$. (If we wish to find only irreducible pairs, then we impose the further constraint
$d_i \ne p_1\cdots p_{i-1}$.) We search for solutions to (2.10) recursively: suppose that $p_1,\ldots,p_{i-1}$ and
$d_1,\ldots,d_{i-1}$ have been chosen, loop over all proper divisors $d_i$ of $m$ such that (2.11) holds for
all $j < i$, and then over all primes $p_i \mid \frac{m}{p_1\cdots p_{i-1}}$ such that (2.10) holds. Since (2.10) is a very
restrictive condition, most branches of the search tree are pruned quickly, so this method is
substantially more efficient than naively trying all permutations of the prime factors of $m$.
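
The recursive search just described can be sketched as follows. This is a minimal illustration and not the authors' implementation: the function names (`prime_factors`, `search`, `partner`) are ours, the divisor enumeration is naive, and we assume the reading of (2.10) and (2.11) in which each $d_i$ is a proper divisor of $m$ with $p_1\cdots p_{i-1} \equiv d_i \pmod{p_i}$, and the $d_i$ are distinct and pairwise comparable under divisibility. The helper `partner` further assumes that $d_i$ is the product of the entries preceding $p_i$ in the second tuple of the pair, so that sorting by $d_i$ recovers that tuple.

```python
from math import prod

def prime_factors(m):
    """Distinct prime factors of m by trial division."""
    ps, d = [], 2
    while d * d <= m:
        if m % d == 0:
            ps.append(d)
            while m % d == 0:
                m //= d
        d += 1
    if m > 1:
        ps.append(m)
    return ps

def search(m, irreducible_only=False):
    """All (p-tuple, d-tuple) solutions of (2.10) subject to (2.11) for squarefree m."""
    primes = prime_factors(m)
    k = len(primes)
    divisors = [d for d in range(1, m) if m % d == 0]  # proper divisors of m
    solutions = []

    def compatible(d, chosen):
        # Condition (2.11): distinct, and comparable under divisibility.
        return all(d != e and max(d, e) % min(d, e) == 0 for e in chosen)

    def extend(ps, ds):
        if len(ps) == k:
            solutions.append((tuple(ps), tuple(ds)))
            return
        prefix = prod(ps)  # p_1 ... p_{i-1}; the empty product is 1
        for d in divisors:
            if not compatible(d, ds):
                continue
            if irreducible_only and d == prefix:
                continue
            for p in primes:
                # Condition (2.10) prunes most branches at this point.
                if p not in ps and (prefix - d) % p == 0:
                    extend(ps + [p], ds + [d])

    extend([], [])
    return solutions

def partner(ps, ds):
    """Reorder the primes by the position of each d_i in the divisor chain."""
    return tuple(p for _, p in sorted(zip(ds, ps)))

for ps, ds in search(30, irreducible_only=True):
    print(ps, ds, partner(ps, ds))
```

For $m = 30$ the output includes the tuple (3, 2, 5) with $(d_1,d_2,d_3) = (10, 5, 1)$, whose partner under the assumption above is (5, 2, 3).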

We coded this procedure and used it to find 195167 (unordered) irreducible pairs of
modulus $|P| < 10^9$. The results reveal that for large $k$, topologies that are much more intricate
than the simple loops observed in Propositions 2.4 and 2.10 can arise. For instance, for any
$n \equiv 58183403 \pmod{635825190}$, $G_n$ has a subgraph as shown in Figure 2, in which there
are 7 paths between $n$ and $635825190n$, 12 out of the 21 pairs of paths are irreducible, and
there are subloops of heights 3, 4, 5, 6 and 8.

Note that only pairs of modulus co-prime to $2\cdot3\cdot7\cdot43$ can possibly appear in $G_1$. Imposing
that restriction reduces the list to just 18 moduli $|P| < 10^9$ with 42 associated arithmetic
progressions $N(P)$, as shown in Table 2.3. Consider, for instance, the progressions with
modulus $115908845 = 5\cdot13\cdot23\cdot31\cdot41\cdot61$. It is known (see the introduction of [1]) that none
of these primes can occur as an edge of the right-most branch of $G_1$ (sequence A000946).
Therefore, it seems natural to expect the nodes of the right-most branch to vary randomly
among the invertible residue classes mod 115908845 as the level increases, in the sense that
each residue class should occur with equal frequency. (This is the same heuristic reasoning as
that supporting Shanks' conjecture [TB] that the first Euclid-Mullin sequence contains every
prime.) Thus, we would expect one of the four corresponding residue classes in Table 2.3
to occur with frequency $4/\varphi(115908845) = 1/19008000$. In particular, we are led to the
following conjecture:

Conjecture 2.17. $G_1$ contains infinitely many loops.


Figure 2. Some nodes of $G_n$ for $n \equiv 58183403 \pmod{635825190}$


More generally, it seems likely that each of the residue classes $N(P)$ in Table 2.3 will
be met infinitely often by the nodes of $G_1$; we provide some evidence towards this in the
next section. It is difficult to compute the overall probability of a random node on the
graph landing in one of the residue classes, since these events are not independent, i.e. the
classes overlap in non-trivial ways. However, it is apparent from the first few lines of the
table that the greatest chance of finding a loop comes from the progressions of modulus
$2813785 = 5\cdot13\cdot73\cdot593$, with density $2/\varphi(2813785) = 1/1022976$. Thus, on the sub-graph
of nodes co-prime to 2813785, we expect roughly one out of every million nodes to be the
base of a loop of height 4.
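
The densities quoted here are easy to reproduce. The short sketch below computes $\varphi(|P|)$ for a squarefree modulus from its prime factors and divides by the number of residue classes $a(P)$ listed for it, which gives the "1/density" column of Table 2.3. The function name and the choice of three sample rows are ours.

```python
def phi_squarefree(primes):
    """Euler's totient of a squarefree integer, given its distinct prime factors."""
    result = 1
    for p in primes:
        result *= p - 1
    return result

# (prime factors of |P|, number of residue classes a(P)) for three rows of Table 2.3
rows = [
    ((5, 13, 73, 593), 2),
    ((5, 13, 23, 31, 41, 61), 4),
    ((197, 211, 2969), 1),
]
for primes, classes in rows:
    inverse_density = phi_squarefree(primes) // classes
    print(primes, inverse_density)
```

The three printed values are 1022976, 19008000 and 122162880, in agreement with the table.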


3. Numerical results 


We used two methods for exploring $G_1$ numerically. First, we used freely available software
implementations of the elliptic curve method (see GMP-ECM 0 ) and general number field sieve
(see YAFU [6], msieve [5] and GGNFS |1]) to compute as many nodes as was practical for levels
up to 17. This was a community effort, with support from users of mersenneforum.org.

Table 3.1 lists the number of nodes that we have computed at each level of the graph $G_1$.
The final column is the number of remaining unfactored composites at that level. Factoring
a composite at a given level will increase the number of nodes at that level (by at least 2)
and all subsequent levels. The single remaining composite at level 13 is the 253-digit number:


30741638041263757309600460000064107032998604910525153993522043894945654246227689310806- 

05652579832748915879865519993669161314951649593763245464995966627308199534468607184384- 

744257573685683221611440202806222725727083224756010635164700144499225512799343807. 


We have been unable to factor this number despite running GMP-ECM on approximately
225,000 curves at $B_1 = 8.5 \times 10^8$ and 44,000 curves at $B_1 = 3 \times 10^9$ (and default $B_2$ values);
this is a level of effort comparable to a “t70”, meaning it has a reasonable chance of revealing 
any prime factors with up to 70 digits.


prime factors of |P|    |P|          a(P)                                          1/density
{5,13,73,593}           2813785      1125513, 1861426                              1022976
{5,11,13,79,523}        29541655     2913109, 19876614                             9771840
{5,13,17,53,563}        32972095     45473, 14501753, 15173846, 15665474           5611008
{5,11,23,31,1307}       51254005     29374824, 37354844                            17239200
{5,13,23,31,41,61}      115908845    30432518, 43262953, 74975328, 87805763        19008000
{197,211,2969}          123412423    114015537                                     122162880
{5,11,23,67,1831}       155186405    92870549                                      106286400
{5,13,19,23,71,89}      179491195    106001778, 120823468, 140339224, 145796156    29272320
{5,13,29,61,1597}       183631045    16718992, 26947777, 40801752, 51030537        32175360
{5,11,13,41,73,113}     241819435    31106978, 108457851                           77414400
{5,11,13,17,97,233}     274715155    161397329, 273388114                          85524480
{5,11,13,733,773}       405125435    26064013, 332551556                           135624960
{5,11,19,41,83,127}     451629145    16717776, 363759119                           148780800
{5,13,19,23,37,449}     471892265    331562178, 399028904                          153280512
{5,19,53,337,421}       714350695    171690041, 516232304                          264176640
{5,11,19,53,71,199}     782534665    504298018, 599617009                          259459200
{11,19,29,71,1871}      805149301    159790883, 664072158                          329868000
{5,11,23,61,67,167}     863399185    132876677, 201396989                          289238400

Table 2.3. Multiple $k$-tuples of modulus $|P| < 10^9$ with $(|P|, 2\cdot3\cdot7\cdot43) = 1$


level    nodes      composites
≤ 4      1          0
5        2          0
6        4          0
7        9          0
8        24         0
9        52         0
10       165        0
11       555        0
12       2020       0
13       7948       1
14       32738      8
15       141619     636
16       622317     13445
17       2550301    186060

Table 3.1. Number of nodes by level in $G_1$.
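
The first rows of Table 3.1 can be reproduced with a few lines of Python. The sketch below treats a node as the integer obtained by multiplying the edge primes along a path from the root 1, and takes the children of a node $n$ to be $np$ for the primes $p$ dividing $n + 1$. It uses plain trial division, so it is only practical for the lowest levels; the computations reported in the paper relied on ECM and the number field sieve instead. The function names are ours.

```python
def prime_factors(m):
    """Distinct prime factors of m by trial division."""
    ps, d = [], 2
    while d * d <= m:
        if m % d == 0:
            ps.append(d)
            while m % d == 0:
                m //= d
        d += 1
    if m > 1:
        ps.append(m)
    return ps

def level_counts(max_level):
    """Number of nodes of G_1 at levels 0..max_level (the root 1 is level 0)."""
    level = {1}
    counts = [1]
    for _ in range(max_level):
        # Each node n has one child n*p for every prime p dividing n + 1.
        level = {n * p for n in level for p in prime_factors(n + 1)}
        counts.append(len(level))
    return counts

print(level_counts(6))  # [1, 1, 1, 1, 1, 2, 4]
```

The graph first branches at level 5 because $1806 + 1 = 1807 = 13 \cdot 139$, which matches the table.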


The numbers appearing in Theorem 1.1 were found by checking our data for the residue
classes in Table 2.3. If there is a lower node with multiple paths to 1 then there must be a
loop of height $k$ starting from some node of level $\le 20 - k$, viz. at most 17 if $k = 3$ and 16 if
$k \ge 4$. Although we have not been able to compute the full graph up to level 17, we expect
that there are no more than one million nodes remaining to be found up to that level, with
at most 50 thousand of those at level 16 or below (and fewer still that are co-prime to 5 and
13). In view of Table 2.3, it seems unlikely that one of those will yield a loop. However, that
cannot be established definitively until the full graph is computed up to level 18, which is
out of reach with present technology.

Our second numerical method aimed to produce large quantities of nodes rather than a 
comprehensive list of all of them. We began with our list of nodes at level 17 and followed 
only the edges corresponding to primes below some bound B. Taking B = 2^^ and computing 


up to level 50, we found at least one match to every congruence class listed in Table 2.3;
in particular we found loops of heights 3, 4, 5 and 6. This method was also helpful for
investigating some other statistical questions, as we describe in the next section.


k     X_{k+1}/(X_k sqrt(2k))
9     0.748
10    0.752
11    0.776
12    0.803
13    0.808
14    0.825

Table 4.1. Estimated values of $X_{k+1}/(X_k\sqrt{2k})$

4. Related questions 


In this final section, we record some numerical observations and heuristics on related
questions:

Does every prime occur as an edge in $G_1$? This seems very likely. With the second
method described above, we verified that every prime below $10^9$ occurs.

How does the number of nodes at level $k$ grow asymptotically as $k \to \infty$? Let $X_k$
denote the number of nodes of $G_1$ of level $k$. Heuristically, based on the Erdős-Kac
theorem, we expect that for a typical node $n$, $n + 1$ will have about $\log\log n$ prime
factors, with the values of $\frac{\log\log p}{\log\log n}$ uniformly distributed on [0,1] as $p$ varies over
the prime factors of $n + 1$.

Let $n_k$ be the nodes of a typical path in $G_1$, with $n_0 = 1$, and define $\theta_k$ so that

$$n_{k+1}/n_k = \exp\bigl([\log(n_k + 1)]^{\theta_k}\bigr). \tag{4.1}$$

Then by the above heuristic, $\theta_k$ should vary uniformly over [0,1] as $k \to \infty$. If we
instead treat the $\theta_k$ formally as independent, uniform random variables on [0,1] and
define $n_k$ by (4.1), then it is not hard to see that

$$\lim_{k\to\infty} \frac{\log\log n_k}{\sqrt{2k}} = 1$$

holds almost surely. Thus, we might expect the typical node of level $k$ to be of size
$\exp\exp([1+o(1)]\sqrt{2k})$. (This analysis ignores the fact that $n_k + 1$ is co-prime to $n_k$ and
hence typically has no small prime factors; however, in the random model, the bulk
of the contribution to $\log\log n_k$ comes from the values of $\theta_k$ close to 1, corresponding
to the large prime factors, so this makes little difference.) In turn, this leads to the
conjecture that $X_{k+1}/X_k = (1 + o(1))\sqrt{2k}$, or equivalently $\log X_k = \frac{k}{2}\bigl(\log\frac{2k}{e} + o(1)\bigr)$.

As far as we are aware, it is not even known that $X_k$ is unbounded, so this remains
largely guesswork. Table 4.1 shows estimated values of $X_{k+1}/(X_k\sqrt{2k})$ for $k \le 14$, based on
the data in Table 3.1. Although the data are very limited, they are at least consistent
with the above guess, in that the ratio appears to grow slowly towards 1.
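
The entries of Table 4.1 can be compared against the raw counts of Table 3.1 as follows. This sketch assumes that the tabulated quantity is $X_{k+1}/(X_k\sqrt{2k})$, a normalization we inferred because it reproduces the printed values; the dictionary and function names are ours. The raw counts ignore the unfactored composites at levels 13 and above, so for the largest $k$ the result falls slightly below the estimate printed in the table.

```python
from math import sqrt

# Node counts X_k from Table 3.1 (levels 9 to 15).
nodes = {9: 52, 10: 165, 11: 555, 12: 2020, 13: 7948, 14: 32738, 15: 141619}

def normalized_ratio(k):
    """X_{k+1} / (X_k * sqrt(2k)) computed from the raw node counts."""
    return nodes[k + 1] / (nodes[k] * sqrt(2 * k))

for k in range(9, 15):
    print(k, round(normalized_ratio(k), 3))
```

For $k = 9, \ldots, 12$, where every level involved is fully or almost fully factored, the computed values agree with Table 4.1 to three decimal places.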

Are there arbitrarily long chains of nodes with only one child each? This is related to
the previous two questions. The basic heuristic underlying Shanks' conjecture is that
the nodes of a given path in $G_1$ should vary randomly among the invertible residue
classes modulo a fixed prime $p$, until $p$ occurs as an edge (beyond which every node is
divisible by $p$). One (perhaps the only) conceivable way in which this heuristic might
fail is if $n + 1$ is prime for every node $n$ of sufficiently large level along the path. In
fact, as discovered by Kurokawa and Satoh [S], that can happen for the analogous
question over $\mathbb{F}_p[x]$.

All numerics to date indicate that this pathology does not occur over $\mathbb{Z}$, but it
is an interesting question whether there are arbitrarily long chains in $G_1$ of nodes
$n$ such that $n + 1$ is prime. For a random node $n$, we can estimate the probability
that $n + 1$ is prime as $\frac{n}{\varphi(n)\log n}$, so the chance that there is a unique path of length $\ell$
descending from $n$ is roughly

$$\frac{n}{\varphi(n)\log n}\cdot\frac{n}{\varphi(n)\log(n^2)}\cdots\frac{n}{\varphi(n)\log(n^{2^{\ell-1}})} = \Bigl(\frac{n}{\varphi(n)\,2^{(\ell-1)/2}\log n}\Bigr)^{\ell} \gg_\ell (\log n)^{-\ell}.$$

As above, we expect the $n$ of level $k$ to typically satisfy $\log n = \exp([1+o(1)]\sqrt{2k})$. Hence, if our
asymptotic guess for $X_k$ holds then we should indeed expect chains of length $\ell$ to
occur for sufficiently large $k$, and in fact we might expect $\ell$ as large as about 2 ■.
By our second method we found several examples of nodes followed by a unique path
of length 4; the lowest (after the root node 1) is the following node at level 20:

2 · 3 · 7 · 43 · 139 · 50207 · 1607 · 38891 · 71609249149971437 · 97272377313541 · 318004829
· 1555110880896883 · 39807662109343 · 53437 · 35251 · 79 · 2011283825921 · 29 · 17 · 241.

Is $G_1$ planar? Our search for multiple $k$-tuples of small modulus uncovered several
arithmetic progressions of $n$, e.g. $n \equiv 93397 \pmod{510510}$, such that $G_n$ is not planar.
As a generalization of Theorem 1.1, it is a natural question whether $G_1$ itself is planar.
However, despite making an extended search, every progression that we found leading
to non-planar graphs had modulus divisible by 6, and it is unclear whether or not
that is a necessary condition. In any case, if $G_1$ is non-planar, that fact is likely not
manifested until astronomically large level, so this question is unlikely to be settled
in the near future.

How does the number of irreducible pairs of modulus $\le X$ grow asymptotically as
$X \to \infty$? The proof of Theorem 2.15 shows that, for large $X$,

$$\#\{q \in \mathbb{Z}\cap[1,X] : \exists \text{ an irreducible pair of modulus } q\} \gg X^{1/5}, \tag{4.2}$$

and this gives a lower bound for the number of irreducible pairs of modulus up to $X$.
However, a log-log fit of our data up to $10^9$ suggests that this is too low, and that (4.2)
is perhaps asymptotic to cX®/® for some $c > 0$. Note that the moduli exhibited in
the lower bound in (4.2) are all even (for odd moduli the proof of Theorem 2.15 gives
only a lower bound $\gg X^{1/7}$); our numerics also suggest that almost all irreducible
pairs have even modulus.


Finally, we record the latest results on the computation of the Euclid-Mullin sequence and
some of its relatives. Let $M_n$ denote the first Euclid-Mullin sequence starting with the prime
$n$, i.e. the edges of the left-most path in $G_n$. Wagstaff na computed $M_2$ up through the
43rd term (180 digits). Much computational effort, including several large GNFS world-wide
distributed efforts, has since been expended on factoring the integers needed to extend the
sequence. In 2012, Ryan Propper found a 75-digit factor using ECM; it remains the fifth
largest factor ever produced by ECM.


n     p     step    digits    OEIS
2     41    52      335       A000945
5     31    58      347       A051308
11    29    56      313       A051309
13    17    58      353       A051310
17    37    31      232       A051311
19    43    73      922       A051312
23    29    62      515       A051313
29    67    80      566       A051314
31    29    38      240       A051315
37    59    77      826       A051316
41    43    56      933       A051317
47    23    36      194       A051319
53    71    92      526       A051320
59    37    79      1059      A051321
61    29    47      501       A051322
67    19    43      200       A051323
71    79    140     991       A051324
73    83    131     949       A051325
79    17    32      292       A051326
83    71    65      296       A051327
89    79    79      743       A051328
97    53    52      261       A051330

Table 4.2. Summary of $M_n$ for $n < 100$.


Table 4.2 is a summary of known computational results for the distinct sequences with
$n < 100$. The 'p' column is the smallest prime not yet confirmed as a member of the
corresponding sequence. The 'step' column indicates the number of known terms and the
'digits' column the number of decimal digits in the unfactored composite needed for the next
step. The final column is the corresponding entry number in the OEIS. It is unlikely that
any of the blocking composites has a factor of less than 45 digits.
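
For readers who wish to experiment, the opening terms of a sequence $M_n$ can be generated directly from the definition: each new term is the least prime factor of one plus the product of all earlier terms. The sketch below uses trial division, so it stalls as soon as a large composite must be factored; the 'step' and 'digits' columns of Table 4.2 indicate how far dedicated factoring efforts have reached. The function names are ours.

```python
def least_prime_factor(m):
    """Smallest prime factor of m >= 2, by trial division."""
    d = 2
    while d * d <= m:
        if m % d == 0:
            return d
        d += 1
    return m  # no divisor up to sqrt(m), so m itself is prime

def euclid_mullin(start, terms):
    """First `terms` terms of the first Euclid-Mullin sequence starting at `start`."""
    seq = [start]
    product = start
    while len(seq) < terms:
        p = least_prime_factor(product + 1)
        seq.append(p)
        product *= p
    return seq

print(euclid_mullin(2, 8))  # [2, 3, 7, 43, 13, 53, 5, 6221671]
```

The printed terms are the opening terms of A000945 quoted in the introduction.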
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